photographic image
Statistical Modeling of Images with Fields of Gaussian Scale Mixtures
The local statistical properties of photographic images, when represented in a multi-scale basis, have been described using Gaussian scale mixtures (GSMs). Here, we use this local description to construct a global field of Gaussian scale mixtures (FoGSM). We show that parameter estimation for FoGSM is feasible, and that samples drawn from an estimated FoGSM model have marginal and joint statistics similar to those of wavelet coefficients of photographic images. We develop an algorithm for image denoising based on the FoGSM model, and demonstrate substantial improvements over the current state-of-the-art denoising method based on the local GSM model. Many successful methods in image processing and computer vision rely on statistical models for images, and it is thus of continuing interest to develop improved models, both in terms of their ability to precisely capture image structures, and in terms of their tractability when used in applications.
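As a rough illustration of the local GSM description mentioned above (not the FoGSM model itself, and not the authors' code), a scalar GSM sample can be written as x = sqrt(z) * u, where u is Gaussian and z is a positive hidden multiplier. The sketch below assumes a log-normal z; that choice, the parameter `sigma_log_z`, and the function names are illustrative assumptions. It compares the sample kurtosis of GSM draws with that of plain Gaussian draws.

```python
import math
import random


def sample_gsm(n, sigma_log_z=1.0, seed=0):
    """Draw n scalar samples x = sqrt(z) * u, with u ~ N(0, 1).

    Assumption: the hidden multiplier z is log-normal. The source does not
    specify the multiplier distribution.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.lognormvariate(0.0, sigma_log_z)  # positive hidden multiplier
        u = rng.gauss(0.0, 1.0)                   # Gaussian component
        samples.append(math.sqrt(z) * u)
    return samples


def sample_gaussian(n, seed=1):
    """Draw n samples from a standard Gaussian for comparison."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def kurtosis(xs):
    """Fourth standardized moment; close to 3 for Gaussian data."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return sum((x - mean) ** 4 for x in xs) / (n * var ** 2)


gsm = sample_gsm(20000)
gauss = sample_gaussian(20000)
print("GSM kurtosis:", kurtosis(gsm))
print("Gaussian kurtosis:", kurtosis(gauss))
```

Under these assumptions the GSM draws should show noticeably heavier tails (kurtosis above 3) than the Gaussian draws, which is the kind of marginal statistic the abstract refers to for wavelet coefficients.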
Why The Creative Economy Shouldn't Fear Generative A.I.
Artificial intelligence is all over the news. When ChatGPT, OpenAI's new chatbot, was released last month, it seemed, finally, to match the hype that generative A.I. has been promising for years--an easy-to-use machine intelligence for the general public. Wild predictions soon followed: the death of search engines, the end of homework, the hollowing-out of creative professions. And, for the first time, such predictions didn't seem abstract. When an A.I. bot like ChatGPT can write a coherent story or essay in seconds, and visual applications like Midjourney, Stable Diffusion and DALL-E 2 produce similarly comprehensible images, you have to wonder if human creativity--slow and often uncertain--might be superfluous.
RadImageNet: Training AI Models With Radiologic vs. Photographic Images
Yang Yang, PhD, Zahi Fayad, PhD, Xueyan Mei, PhD, Timothy Deyer, PhD and colleagues from Icahn School of Medicine at Mount Sinai, University of Oklahoma, and Weill Cornell Medicine conducted a study to compare the performance of AI models pretrained on radiologic images with that of models pretrained on photographic images. They created a large-scale, diverse medical imaging dataset and used it to train CNNs on radiologic images only. The study is significant because the researchers demonstrated that pretraining with radiologic images rather than photographic images may result in more effective transfer learning for radiology AI models. A paper detailing the study, entitled "RadImageNet: An Open Radiologic Deep Learning Research Dataset for Effective Transfer Learning," was published in RSNA Radiology AI on July 27, 2022. Within 10 days of publication, the paper had been downloaded over 1,000 times.
Application of Computer Vision: Object Classification
Object classification from a photographic image is a complex process and is fast becoming an important task in the field of computer vision. Real-time object classification from images has been used in various fields such as healthcare, manufacturing, and retail. Object classification from photographic images is a technique for predicting the class of an object in an image, with the goal of accurately identifying the feature in the image. It involves labelling the images and assigning them to predefined classes based on the feature or object observed. Object classification from images is an important application in the domain of computer vision, and the field involves different techniques and algorithms to acquire, analyse, and process the images. To put it in common terms, object classification from images is the process of predicting the class of the objects in an image, with the goal of unambiguously distinguishing the feature or object in the image. In general, an object classification algorithm takes in a set of features that represent the objects in the image and uses them to predict the class of each object.
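The last point, a classifier that takes feature vectors and predicts a class, can be sketched with a nearest-centroid rule. This is only one simple choice among many and is not a method named in the text; the two-number feature vectors and the labels "sky" and "brick" are made-up illustrative values.

```python
import math


def fit_centroids(features, labels):
    """Average the feature vectors of each class into one centroid per class."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Hypothetical features per image: [mean brightness, edge density].
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["sky", "sky", "brick", "brick"]

centroids = fit_centroids(train_features, train_labels)
print(predict(centroids, [0.7, 0.3]))
print(predict(centroids, [0.25, 0.75]))
```

In practice the features would come from an image-processing or learned feature-extraction stage; here they are typed in by hand to keep the example self-contained.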
Introducing neural supersampling for real-time rendering - Facebook Research
Real-time rendering in virtual reality presents a unique set of challenges -- chief among them being the need to support photorealistic effects, achieve higher resolutions, and reach higher refresh rates than ever before. To address this challenge, researchers at Facebook Reality Labs developed DeepFocus, a rendering system we introduced in December 2018 that uses AI to create ultra-realistic visuals in varifocal headsets. This year at the virtual SIGGRAPH Conference, we're introducing the next chapter of this work, which unlocks a new milestone on our path to create future high-fidelity displays for VR. Our SIGGRAPH technical paper, entitled "Neural Supersampling for Real-time Rendering," introduces a machine learning approach that converts low-resolution input images to high-resolution outputs for real-time rendering. This upsampling process uses neural networks, training on the scene statistics, to restore sharp details while saving the computational overhead of rendering these details directly in real-time applications.
Automated detection of oral pre-cancerous tongue lesions using deep learning for early diagnosis of oral cavity cancer
Shamim, Mohammed Zubair M., Syed, Sadatullah, Shiblee, Mohammad, Usman, Mohammed, Ali, Syed
Discovering oral cavity cancer (OCC) at an early stage is an effective way to increase patient survival rate. However, the current initial screening process is done manually and is expensive for the average individual, especially in developing countries. This problem is further compounded by the lack of specialists in such areas. Automating the initial screening process using artificial intelligence (AI) to detect pre-cancerous lesions can prove to be an effective and inexpensive technique that would allow patients to be triaged accordingly to receive appropriate clinical management. In this study, we have applied and evaluated the efficacy of six deep convolutional neural network (DCNN) models using transfer learning for identifying pre-cancerous tongue lesions directly from a small data set of clinically annotated photographic images to diagnose early signs of OCC. A DCNN model based on the VGG19 architecture was able to differentiate between benign and pre-cancerous tongue lesions with a mean classification accuracy of 0.98, sensitivity of 0.89 and specificity of 0.97. Additionally, the ResNet50 DCNN model was able to distinguish between five types of tongue lesions, i.e., hairy tongue, fissured tongue, geographic tongue, strawberry tongue and oral hairy leukoplakia, with a mean classification accuracy of 0.97. Preliminary results using an (AI+Physician) ensemble model demonstrate that an automated initial screening process of tongue lesions using DCNNs can achieve near-human level classification performance for diagnosing early signs of OCC in patients.
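For readers unfamiliar with the reported metrics, the sketch below shows how accuracy, sensitivity and specificity are computed from binary predictions using their standard definitions. The labels are toy values invented for illustration and are not data from the study; the function name is likewise hypothetical.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


# Toy example: 1 = pre-cancerous, 0 = benign (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false negative, 5 true negatives and 1 false positive, giving an accuracy of 0.8, a sensitivity of 0.75 and a specificity of about 0.83.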
Estimation of Body Mass Index from Photographs using Deep Convolutional Neural Networks
Pantanowitz, Adam, Cohen, Emmanuel, Gradidge, Philippe, Crowther, Nigel, Aharonson, Vered, Rosman, Benjamin, Rubin, David M
Obesity is an important concern in public health, and Body Mass Index is one of the useful (and widespread) measures. We use Convolutional Neural Networks to determine Body Mass Index from photographs in a study with 161 participants. Limited data, a common problem in medicine, is addressed by reducing the information in the photographs through the generation of silhouette images. The results show high correlation when tested on unseen data.
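Two ingredients of this abstract can be sketched briefly: the standard Body Mass Index formula (weight in kilograms divided by height in metres squared) and the idea of reducing a photograph to a silhouette. The abstract does not say how the silhouettes were generated, so the fixed-threshold approach below, the threshold value and the function names are assumptions for illustration only.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    return weight_kg / height_m ** 2


def silhouette(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 values) into a binary mask.

    Assumption: the subject is darker than the background, so pixels below
    the threshold are marked 1 (subject) and all others 0 (background).
    """
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Tiny hypothetical 3x3 "photo": a dark vertical figure on a bright background.
photo = [
    [220, 40, 230],
    [210, 35, 225],
    [215, 50, 240],
]

mask = silhouette(photo)
print(mask)
print(round(bmi(70.0, 1.75), 2))
```

A silhouette keeps only the outline of the subject, which is one way to shrink the input a network has to learn from when few training images are available.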